WHAT SHOULD BE TESTED IN THE FUTURE
Gerald W. Bracey, Ph.D. 

ABSTRACT 

Some reasons are given for why, at the current time, there are few incentives for test
publishers to make significant innovations in what is tested or how it is tested. A brief
discussion of research on growth spurts in the brain, hemispheric differences and other
neurological phenomena is followed by discussion of some conclusions that have been
drawn from this work. While scepticism is expressed over the great inferential chasms
one must leap to arrive at some conclusions, hope is expressed that the field will
ultimately prove fruitful in permitting more sensitive assessment of individual children.
Recent studies in cognitive psychology are discussed and hope is expressed that these
areas, too, will lead to improved assessment although their current relevance to practice
is not great. Finally, some areas of investigation that are currently being ignored are
mentioned as being potential sources of useful evaluation.

Although I have had far less time than I had hoped to explore the libraries in
preparation for this speech, I think I did uncover some things to be considered about
what ought to be tested in the future and how it ought to be tested, even though I will
say those things with much less organization than I am comfortable with and much more
tentatively than I would prefer.

Let me first discuss what I see as a major obstacle to any of the innovations in testing
practice that I see as desirable. The obstacle is that for test publishers, there is little
incentive to produce changes in tests other than perhaps to broaden the scope of what is
tested, in the sense of adding new curriculum areas, and thus sell more tests. That in
itself might not be a bad idea depending on how the tests were conceived and
constructed. If it is true, as is so often alleged and as I tend to believe, that what is
tested is, if not what is taught, what is emphasized, then developing more
comprehensive batteries in science, fine arts, foreign language would help drive
instruction, to use Jim Popham's phrase, in a healthy way. That would be fine as far as
it goes, but unfortunately it would not go very far. The reason it would not go very far
is that when you expand the scope of testing to include more subjects and more
objectives within a subject, the technology of testing soon falls short. For example, we
are developing in Virginia an assessment component for a set of learner outcome
objectives, K-twelve, all subject areas from science and math to fine arts and physical
education. We are thus faced with assessing objectives such as

The student will gain insight into the culture and history of a people through the study
of literature.

or

The student will describe physical, chemical, and nuclear changes using the law of
conservation of matter and energy.

We have made a conscious decision that assessment strategies must be provided for all
objectives, not merely those that lend themselves to easy assessment, because of our
concern that we not establish a hierarchy of importance within the objectives by
selecting only some for assessment. We are thus faced with the development of some
very creative assessment strategies. Believe me, we are not having an easy time of it.

Additionally, areas such as affective measures are badly ignored in assessment
programs to the detriment of education in general. The State Board of Education has
established nine goals of public education in the state of Virginia, but at this time, the
assessment program addresses only two of them, those dealing with achievement. We are
also moving in these areas, but the going is slow.

In spite of the creative work I see happening within the Department and a few other
places, I think that my assertion concerning the disincentives for test innovations holds.
I draw my conclusion in part not only from the limited technology of testing but from
John Goodlad's recent conclusion that "in the how and the what of what is taught, a
school is a school is a school." Bob Glaser had made a similar observation somewhat
earlier. Goodlad noted that the schools were characterized by a low level of cognitive
demand and cognitive response. Glaser mentioned the inflexibility of curricula
treatments. Until there is a change in what Sirotnik, one of Goodlad's co-investigators,
called the persistency, consistency and mediocrity of life in classrooms, there seems
little value in, little incentive for, test manufacturers developing better diagnostic,
prescriptive instruments unless, again, those instruments can be used to drive
instruction. That is a big if, because I recall last year Eva Baker here wondering aloud
if curriculum makers and test builders would ever talk to one another.

I want to talk now about the implications of two areas of research: testing
considerations drawn from research tying brain function to learning and research
stemming from investigations in cognitive psychology.

Implications from the studies of the brain.

This is an area of study where I had been hearing a lot of second hand talk about what
exciting things were going on, but when I went to the library cupboard, I found not that
it was bare but that it wasn't very amply stocked. In 1978, the NSSE Yearbook was
entitled The Brain and Education and it contains many articles by many of the major
brain researchers in which many of them make some rather interesting conjectures on
the relationship of their work to education, but that's what they are: conjectures. Most
of the work revolves around MacLean's notion of a triune brain, Sperry's work involving
lateralization, and Epstein's work on growth spurts. Herman Epstein, who has postulated
theories of learning growth based on the studies of spurts in brain growth, largely using
brains of dead children, asserts that Head Start was bound to fail because it did not
occur during the period of a growth spurt. Someone, whose name escapes me now, has
also postulated that since the growth spurt at age 11 seems to be twice as great for
girls as for boys, perhaps we should consider sex segregated schooling during the
middle school years. Many of the conjectures appear contradictory and after three
hundred and seventy odd pages of this, editors Allan Mirsky and Jeanne Chall are left to
say, in a chapter entitled Implications for Education, "Gee ain't this interesting but
what does it all mean?"

That does not stop Mirsky and Chall, however, from posing the following futuristic
scenario:

The test battery of the twenty-first century would be the
responsibility of a team of specialists including the educational
neuroscientist. It would encompass behavioral and photographic
analyses designed to identify motor patterns, cerebral dominance and
related psycho- and physiomotor capacities; it might also include
electrographic and sensory tests that would provide data about the
relative maturity and efficiency of processing information in all
relevant sensory modalities. Attentional capacities would be
assessed by both behavioral and electrophysiological means, and the
sources of attentional difficulties (if any) categorized and identified
with respect to inter- as opposed to extra-cerebral causes. Brain
size, maturity, and relative degree of myelinization in key areas
would be assessed by means of noninjurious neuroradiological
techniques. Oxygen utilization in various brain regions at rest and
during a variety of mental activities would be assessed by means of
dynamic energy utilization techniques. Such methods currently exist
and need only to be refined further. Brain neurohumoral balance and
maturity would be assessed by means of biochemical assays
performed on a few drops of blood and urine. Computer-assisted
analyses of these data would enable the educational neuroscientist to
perform accurate assessment of the child's developmental stage, his
particular strengths and weaknesses, the instructional materials he
would best be able to handle, and the problem areas that would most
likely be encountered during his educational career.

Mirsky and Chall close the volume by saying:

As exciting as this Utopian aid to education may be in the
twenty-first century or a few years earlier, it must be realized that
it cannot be applied in the absence of that most effective and
essential of all educational forces - able, patient, and caring
teachers.



Work such as that just quoted led Barbara Hutson of Virginia Tech, in
a piece entitled "Brain Based Curricula - Salvation or Snake Oil?" to
note that

"MacLean's work is based on surgical experimentation with green
lizards and squirrel monkeys, the work of Sperry and others is based
on human split brain studies and noninvasive analogues, Epstein's
work on human cadavers. The MacLean/Hart position credits us with
three brains and three minds; the Sperry/Samples position credits us
with two hemispheres and two minds; the Epstein/Toepfer position
credits us with one mind which works - sometimes".

She notes as well that many of the assertions require as yet to be demonstrated brain
structures, as yet to be demonstrated brain functions and/or as yet to be demonstrated
linkages between the first two and psychological processes.

Despite the great inferential chasms one must leap from research on brain function to
assessment, there is a general implication here which I found in all of my research
whether I started with brain based research, developmental psychology, cognitive
psychology, policy studies, or what: the research all implies sizable differences in
learning styles, growth rates, perceptual preferences, information processing, etc., which
imply that both instruction and assessment should be more tailored to the individual
than they currently are or probably can be in the immediate future.

Earlier today you heard Dick Schutz say "forget about
Aptitude-Treatment-Interaction". I am tempted to say "Forget Dick Schutz on ATI".
Everything that I read implies that it ought to be there, although we might wish to call
it "Learning Style Treatment Interaction" or "Lateralization Treatment Interaction".
There is a real question as to how much of any variance the differences in style or
lateralization would account for as opposed to the communalities shared by all humans,
but I would be loath to dismiss ATI at this time.

While I poked fun at the Mirsky-Chall description of a twenty-first century test
battery, I don't think it preposterous to consider using the emerging technology to
provide a much more sensitive instrument for each child, and I don't just mean in terms
of difficulty level. Gagne, Sternberg and others have posed that children must learn
certain skills to a certain level of automaticity. If this be true then it doesn't seem like
a tremendous technological problem to have a computer program with a realtime clock
that can assess reaction time or even the amount of time spent on different parts of the
problem. In fact, Sternberg has obliquely recommended such latency measures and last
week I saw a program built for an 8K computer that contained two real-time clocks. If
you can embed two realtime clocks in an 8K machine, the technological problems in
assessing latency and other temporal aspects of problem solving do not seem great at
all.
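
To make the point about latency measures concrete, here is a minimal sketch in Python
of a program that times the separate parts of a problem. It is only an illustration
under stated assumptions: the class name LatencyRecorder, the part labels "read" and
"compute", and the injectable clock are all hypothetical and do not describe any
program mentioned in this talk.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class LatencyRecorder:
    """Records how long a student spends on each named part of a problem.

    Hypothetical illustration only: the names and structure are assumptions.
    """

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self._clock = clock  # any function returning seconds as a float
        self._current: Optional[Tuple[str, float]] = None  # (part, start time)
        self.segments: List[Tuple[str, float]] = []  # (part, elapsed seconds)

    def start(self, part: str) -> None:
        """Close the part in progress, if any, and begin timing a new one."""
        now = self._clock()
        self._close(now)
        self._current = (part, now)

    def stop(self) -> None:
        """Close the part in progress, if any, and stop timing."""
        self._close(self._clock())
        self._current = None

    def _close(self, now: float) -> None:
        if self._current is not None:
            name, began = self._current
            self.segments.append((name, now - began))

    def totals(self) -> Dict[str, float]:
        """Total seconds per part, summed over every visit to that part."""
        out: Dict[str, float] = {}
        for name, elapsed in self.segments:
            out[name] = out.get(name, 0.0) + elapsed
        return out
```

In use, the program would call start("read") when the problem is displayed,
start("compute") when the child begins to work, and stop() when an answer is entered;
totals() then reports the seconds spent on each part. The clock is passed in as a
parameter so that the timing logic can be checked without waiting on a real clock.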

There are other studies, derived from cognitive science and cognitive psychology, that may
also require the assistance of microprocessor technology and it is to these that I now
turn. Some years ago, Anne Anastasi noted that increasing specialization had led to a
concentration upon techniques for test construction without sufficient consideration of
psychological research for the interpretation of test scores. There is no real "theory"
underlying Item Response Theory except a theory of test construction which becomes,
in the end, a theory of technique, not of substance. More recently, Bob Glaser has
affirmed that theories of learning have been ignored by test developers, but has also
noted that until recently, theories of learning were based on experiments contrived to
fit the convenience of the experimenter - and, I would add, they continue to fit the
reward structure of the universities - and that because of this the application of learning
theory to real life, long term learning and the development of competence or even
expertise has been relatively minimal.

The current Zeitgeist, with its emphasis on competence, excellence, and expertise and
higher order skills such as analysis, synthesis, and problem solving, has turned our
attention to the assessment of such skills and, as with the research on brain and
learning, I find on closer examination that we don't know as much as we ought to about
how to assess such skills or even how to describe them.

Some of the research in cognitive science derives from what are called "expert
systems", usually in medicine and mathematics. Such systems are developed as
computer programs to make explicit the rules for problem solving that are implicit or
tough to see when humans do them. Such systems can be used for diagnosis in medicine
and, in the instance of physics, for the development of sets of rules to solve problems.

One system developed by Gordon Novak found that problems in physics textbooks which
appear to call for the use of a couple of equations actually called for ten or twelve, and
that the laws of physics needed to solve the problems and the explanation of how to solve
the problems presented in the textbook were totally inadequate or even missing
altogether. Thus, in order to solve the problems students had to use hidden laws and
equations. One reason, it is alleged, that students are "bad" at science is that the
instructional materials they use never provide them with the materials necessary to
understand well the fundamental concepts or to solve the problems presented them.
The implication is clear from some of the work using expert systems that we will need
to assess not just what a child knows in a summative way, but assess what he knows
about a concept and how this knowledge may show fundamental misconceptions. Tom
Romberg will expand on this in his presentation.

A second set of researches has looked at the differences in how experts approach a
problem and how a novice does. Novice is defined in different ways depending on the
study. In one instance, for example, a novice was someone who had completed a college
course in physics while an expert was a graduate student in physics. Presented with a
set of physics problems, novices tended to react to superficial qualities of the problem
(e.g., these all involve inclined planes) while experts tended to invoke the underlying
principles (e.g., these all can be solved using Newton's second law). Clearly, what needs
to be really assessed here, what is really important here, is not only how does a novice
approach a problem in contrast to an expert, or a less competent person as opposed to a
more competent person, but how do we assess the process from novice to expert, how
do we assess the development of competence. If this sounds a little like it requires a
merger of instruction and assessment, that is no accident. The schism between
instruction and assessment has certainly been as damaging as that between education
and psychology and we should begin to look at instruction/assessment as integral parts
of a single process.

The studies of expert-novice behavior suggest, in part, that the difference between
more and less expert people is in their ability to organize information in long term
memory in such a way that makes it readily accessible for a variety of purposes. Thus
we need to assess these kinds of storage and retrieval and generalization capabilities.
A problem, of course, is that most assessments that take place in classrooms are so far
removed from any definable behavior as to make their analysis into information
processing components virtually impossible.

There are a few researches that are more immediately cogent to elementary and
secondary education. John Seely Brown and his colleagues at Xerox PARC have
developed a computer program that analyzes "bugs" in kids' arithmetic. That is, it
systematically looks at patterns of errors, not just counts rights and wrongs. Building
on this work, someone whose reference has been lost to me has pointed out that it is
much more efficient, even necessary, to point out to the child the "bug" in his problem
solving strategy, not just note that he got it wrong. This pointing out of errors only
tends to be more of what happens in schools or on standardized tests.
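
The difference between counting wrongs and finding the "bug" can be sketched in a few
lines of Python. This is a toy illustration of the idea, not a description of Brown's
program: the two faulty subtraction procedures below are examples of systematic errors
of the kind reported in that line of research, and the function names, the PROCEDURES
table and the sample answers are all assumptions made for the sketch. It assumes
whole-number problems in which the top number is at least as large as the bottom one.

```python
def columns(top, bottom):
    """Pair up the digits of two whole numbers column by column (top >= bottom)."""
    top_digits = str(top)
    bottom_digits = str(bottom).rjust(len(top_digits), "0")
    return [(int(t), int(b)) for t, b in zip(top_digits, bottom_digits)]


def smaller_from_larger(top, bottom):
    """Bug: in every column, subtract the smaller digit from the larger one."""
    return int("".join(str(abs(t - b)) for t, b in columns(top, bottom)))


def borrow_no_decrement(top, bottom):
    """Bug: borrow ten when needed, but never reduce the column borrowed from."""
    digits = [t - b if t >= b else t + 10 - b for t, b in columns(top, bottom)]
    return int("".join(str(d) for d in digits))


# Hypothetical catalogue of procedures a child might be following.
PROCEDURES = {
    "correct": lambda top, bottom: top - bottom,
    "smaller-from-larger": smaller_from_larger,
    "borrow-no-decrement": borrow_no_decrement,
}


def diagnose(responses):
    """Return the procedures that reproduce every (top, bottom, answer) response."""
    return [
        name
        for name, procedure in PROCEDURES.items()
        if all(procedure(top, bottom) == answer for top, bottom, answer in responses)
    ]


def count_right(responses):
    """The conventional score: how many answers are simply correct."""
    return sum(1 for top, bottom, answer in responses if top - bottom == answer)
```

Given a child's answers of 45 to 52 - 17, 63 to 83 - 26 and 35 to 47 - 12, count_right
reports one right out of three, which says nothing about what to teach. The diagnose
function reports that only the "smaller-from-larger" procedure accounts for all three
answers, which tells the teacher exactly which misconception to address.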

One intriguing study looked at the errors made in writing passages of students who
made a large number of errors in writing. Most teachers would probably, given the
enormous number of errors, be able only to send the child for some kinds of
remediation which might or might not match the kids' problems, because a systematic
analysis of errors in writing, as in mathematics, shows that there are patterns of
mistakes which can be placed into categories, while remediation would probably consist
of a predetermined set of procedures adopted by the school system.
I'm certain that a computer program for such analyses is some ways off and one will
probably be required, as the analysis of error patterns is a rather laborious task. An
intriguing aspect to these researches: when the kids were allowed to read aloud their
own badly written passages, they often spontaneously corrected most of the errors.
The error producing problem appears to be something other than that the child doesn't
understand the structure of the language.

Again, the utility of approaches to error pattern analysis is a ways down the road, but I
think it bears watching. It really goes back to the old dictum derived from Piaget that
a child's errors, if adequately examined, may tell you much more about what his
cognitive level is than his correct responses.

There is one area of research that I have not yet had a chance to look at. Much activity
in Russia, so I understand, has focused on the old concept of Lev Vygotsky of the "zone
of proximal development" which is now usually called the "zone of potential
development". The thrust of these researches is to find ways of not only assessing
where a child is in cognitive development but what his potential level is. Knowing these
two facts would allow a tailoring of instruction that would not be too far above the
level of present development, but aimed toward the potential. I don't recall Piaget
using the phrase but in his terms, I think we could call this the "zone of potential
accommodation".

Although this may be a tangent on what we think of in assessment, I think we need to
begin a much more comprehensive assessment of the instructional materials in classes
as well as kids. Most textbooks that I have seen are awful. Ditto most software. And
when I have queried software developers on how they know that something is good, they
often offer the retort that "You don't ask that of textbooks". No, we haven't but we
damn well ought to. An intriguing line of research has been opened up by Thomas
Malone in his analysis of what makes videogames intrinsically motivating. His findings:
the challenge, the curiosity, and even the ambiguity of the rules initially are part of
the motivating aspect. Similar findings were reported at a conference on videogames
and their implication for education held at the Harvard School of Education. Even
allowing for the fact that the conference was funded by Atari, the researchers seemed
to find much to gain from videogames for instructional materials. I am convinced,
from my own experiences with my Apple, that some of the games require such intense
concentration and sustained attention that those abilities may in fact be facilitated.
I, for one, am tired of reports like the Bell Commission report which resulted in calls to
get tough on kids, which usually means get boring and get punitive. Why not get
challenging, get exciting? In any case, it will benefit us little to know all there is about
information processing, brain functioning, problem solving if the materials in
classrooms do little or nothing to evoke or enhance these processes.

Finally, there is one area of investigation that I see no one working on right now. It is
an essential of adaptive intelligence to be able to deal with uncertainty in a situation.
But I don't know of any investigators who are working in the area of finding out what
does a person do when he doesn't know what to do? Given a novel situation, how does a
person go about deciding what to do next? How does he decide how to gather
information and which information to gather? These kinds of skills, at least on the
surface, seem more important in an information society than in previous ones. That is
because, as I painfully discovered en route to this paper, the information explosion has
produced a concomitant ignorance explosion - somehow the electronic marvels I and
others possess lead to more and more papers on more and more topics and I find that I
know less and less about more and more. I understand that there are some 6000 to 7000
scientific reports published every day and certainly at least three or four of them are
worth reading. Thus we need to assess a) how a child learns to separate the
informational wheat from the informational chaff and b) how a child copes with a novel
situation. Given the Goodlad conclusion that a school is a school is a school, however,
the child almost never finds himself in a novel, information sorting situation there.
Sternberg appears to have done some preliminary work in this area in what he calls
executive processes that organize, plan and monitor behavior, but his work is relatively
primitive and it appears to me at this time that little work has been done since the
topic was addressed by Miller, Galanter and Pribram in their 1960 book Plans and the
Structure of Behavior.

Some of the topics I have covered seem remote from immediate practice and some of
them are. I hope you will give them some consideration, however, because, to me at
least, they imply a much richer formulation of testing practice than is currently
available.